Towards Robust Detection of Adversarial Examples
Abstract
Despite substantial recent progress, deep learning methods remain vulnerable to carefully crafted adversarial samples. In this paper, we attempt to improve robustness by presenting a new training procedure and a thresholding test strategy. In training, we propose to minimize the reverse cross-entropy, which encourages a deep network to learn latent representations that better distinguish adversarial samples from normal ones. In testing, we propose a thresholding strategy based on a new metric to filter out adversarial samples and keep only reliable predictions. Our method is simple to implement using standard algorithms, with little extra training cost compared to common cross-entropy minimization. We apply our method to various state-of-the-art networks (e.g., residual networks) and achieve significant improvements in robust prediction in the adversarial setting.
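The abstract does not spell out the reverse cross-entropy or the detection metric, so the following is only a minimal sketch under stated assumptions. It assumes the "reverse" target puts zero weight on the true class and uniform weight 1/(L-1) on the other L-1 classes, and that the loss is the cross-entropy between this target and the softmax output. The function names (`reverse_label`, `reverse_cross_entropy`, `filter_by_threshold`) and the per-input `scores` are hypothetical, introduced here for illustration; the thresholding step takes the metric values as given rather than computing the paper's metric.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_label(true_class, num_classes):
    """Assumed reverse target: 0 on the true class, 1/(L-1) elsewhere."""
    return [
        0.0 if k == true_class else 1.0 / (num_classes - 1)
        for k in range(num_classes)
    ]


def reverse_cross_entropy(logits, true_class, eps=1e-12):
    """Cross-entropy between the assumed reverse target and softmax(logits)."""
    probs = softmax(logits)
    target = reverse_label(true_class, len(logits))
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def filter_by_threshold(scores, threshold):
    """Return indices of inputs whose (hypothetical) score passes the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


if __name__ == "__main__":
    # Mass spread evenly over the non-true classes gives a lower loss
    # than mass concentrated on a single wrong class.
    print(reverse_cross_entropy([-10.0, 0.0, 0.0], true_class=0))
    print(reverse_cross_entropy([0.0, 5.0, 0.0], true_class=0))
    print(filter_by_threshold([0.9, 0.1, 0.5], threshold=0.5))
```

Under this assumed definition, the loss is smallest when the output distribution is uniform over the non-true classes, so a network trained this way would presumably need its logits negated before taking the arg-max at prediction time; that detail is likewise an assumption, not something stated in the abstract.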
Similar resources
Adversarial Deep Learning for Robust Detection of Binary Encoded Malware
Malware is constantly adapting in order to avoid detection. Model-based malware detectors, such as SVMs and neural networks, are vulnerable to so-called adversarial examples, which are modest changes to detectable malware that allow the resulting malware to evade detection. Continuous-valued methods that are robust to adversarial examples of images have been developed using saddle-point optimiza...
A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Examples
Most machine learning classifiers, including deep neural networks, are vulnerable to adversarial examples. Such inputs are typically generated by adding small but purposeful modifications that lead to incorrect outputs while imperceptible to human eyes. The goal of this paper is not to introduce a single method, but to make theoretical steps towards fully understanding adversarial examples. By ...
On the (Statistical) Detection of Adversarial Examples
Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards a mod...
Detecting Adversarial Examples - A Lesson from Multimedia Forensics
Adversarial classification is the task of performing robust classification in the presence of a strategic attacker. Originating from information hiding and multimedia forensics, adversarial classification recently received a lot of attention in a broader security context. In the domain of machine learning-based image classification, adversarial classification can be interpreted as detecting so-c...
Towards Deep Neural Network Architectures Robust to Adversarial Examples
Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, or so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, but can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and explore network topology, pre-process...
Publication date: 2017